SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Bujard, Hermann 

Gossen, Manfred 
Salfeld, Jochen G. 
Voss, Jeffrey W. 

(ii) TITLE OF INVENTION: Methods for Regulating Gene Expressi 

(iii) NUMBER OF SEQUENCES: 10 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Lahive & Cockfield 

(B) STREET: 60 State Street 

(C) CITY: Boston 

(D) STATE: Massachusetts 

(E) COUNTRY: USA 

(F) ZIP: 02109-1875 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: ASCII text 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/383,754 

(B) FILING DATE: 14-JUN-1994

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/076,327 

(B) FILING DATE: 14-JUN-1993

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: DeConti, Giulio A., Jr.

(B) REGISTRATION NUMBER: 31,503 

(C) REFERENCE/DOCKET NUMBER: BBI-013CP3 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 227-7400 

(B) TELEFAX: (617) 227-5941 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1008 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Herpes Simplex Virus

(B) STRAIN: K12, KOS

(vii) IMMEDIATE SOURCE:

(B) CLONE: tTA transactivator

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..1008

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..1008

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 1..207

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 208..335

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1005

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG TCT AGA TTA GAT AAA AGT AAA GTG ATT AAC AGC GCA TTA GAG CTG 48
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

CTT AAT GAG GTC GGA ATC GAA GGT TTA ACA ACC CGT AAA CTC GCC CAG 96
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

AAG CTA GGT GTA GAG CAG CCT ACA TTG TAT TGG CAT GTA AAA AAT AAG 144
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

CGG GCT TTG CTC GAC GCC TTA GCC ATT GAG ATG TTA GAT AGG CAC CAT 192
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

ACT CAC TTT TGC CCT TTA GAA GGG GAA AGC TGG CAA GAT TTT TTA CGT 240
Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

AAT AAG GCT AAA AGT TTT AGA TGT GCT TTA CTA AGT CAT CGC GAT GGA 288
Asn Lys Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

GCA AAA GTA CAT TTA GGT ACA CGG CCT ACA GAA AAA CAG TAT GAA ACT 336
Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

CTC GAA AAT CAA TTA GCC TTT TTA TGC CAA CAA GGT TTT TCA CTA GAG 384
Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

AAT GCA TTA TAT GCA CTC AGC GCT GTG GGG CAT TTT ACT TTA GGT TGC 432
Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

GTA TTG GAA GAT CAA GAG CAT CAA GTC GCT AAA GAA GAA AGG GAA ACA 480
Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

CCT ACT ACT GAT AGT ATG CCG CCA TTA TTA CGA CAA GCT ATC GAA TTA 528
Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

TTT GAT CAC CAA GGT GCA GAG CCA GCC TTC TTA TTC GGC CTT GAA TTG 576
Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

ATC ATA TGC GGA TTA GAA AAA CAA CTT AAA TGT GAA AGT GGG TCC GCG 624
Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Ala
195 200 205

TAC AGC CGC GCG CGT ACG AAA AAC AAT TAC GGG TCT ACC ATC GAG GGC 672
Tyr Ser Arg Ala Arg Thr Lys Asn Asn Tyr Gly Ser Thr Ile Glu Gly
210 215 220

CTG CTC GAT CTC CCG GAC GAC GAC GCC CCC GAA GAG GCG GGG CTG GCG 720
Leu Leu Asp Leu Pro Asp Asp Asp Ala Pro Glu Glu Ala Gly Leu Ala
225 230 235 240

GCT CCG CGC CTG TCC TTT CTC CCC GCG GGA CAC ACG CGC AGA CTG TCG 768
Ala Pro Arg Leu Ser Phe Leu Pro Ala Gly His Thr Arg Arg Leu Ser
245 250 255

ACG GCC CCC CCG ACC GAT GTC AGC CTG GGG GAC GAG CTC CAC TTA GAC 816
Thr Ala Pro Pro Thr Asp Val Ser Leu Gly Asp Glu Leu His Leu Asp
260 265 270

GGC GAG GAC GTG GCG ATG GCG CAT GCC GAC GCG CTA GAC GAT TTC GAT 864
Gly Glu Asp Val Ala Met Ala His Ala Asp Ala Leu Asp Asp Phe Asp
275 280 285

CTG GAC ATG TTG GGG GAC GGG GAT TCC CCG GGT CCG GGA TTT ACC CCC 912
Leu Asp Met Leu Gly Asp Gly Asp Ser Pro Gly Pro Gly Phe Thr Pro
290 295 300

CAC GAC TCC GCC CCC TAC GGC GCT CTG GAT ATG GCC GAC TTC GAG TTT 960
His Asp Ser Ala Pro Tyr Gly Ala Leu Asp Met Ala Asp Phe Glu Phe
305 310 315 320

GAG CAG ATG TTT ACC GAT CCC CTT GGA ATT GAC GAG TAC GGT GGG TAG 1008
Glu Gln Met Phe Thr Asp Pro Leu Gly Ile Asp Glu Tyr Gly Gly
325 330 335

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 335 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

Asn Lys Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Ala
195 200 205

Tyr Ser Arg Ala Arg Thr Lys Asn Asn Tyr Gly Ser Thr Ile Glu Gly
210 215 220

Leu Leu Asp Leu Pro Asp Asp Asp Ala Pro Glu Glu Ala Gly Leu Ala
225 230 235 240

Ala Pro Arg Leu Ser Phe Leu Pro Ala Gly His Thr Arg Arg Leu Ser
245 250 255

Thr Ala Pro Pro Thr Asp Val Ser Leu Gly Asp Glu Leu His Leu Asp
260 265 270

Gly Glu Asp Val Ala Met Ala His Ala Asp Ala Leu Asp Asp Phe Asp
275 280 285

Leu Asp Met Leu Gly Asp Gly Asp Ser Pro Gly Pro Gly Phe Thr Pro
290 295 300

His Asp Ser Ala Pro Tyr Gly Ala Leu Asp Met Ala Asp Phe Glu Phe
305 310 315 320

Glu Gln Met Phe Thr Asp Pro Leu Gly Ile Asp Glu Tyr Gly Gly
325 330 335



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Herpes Simplex Virus

(B) STRAIN: K12, KOS

(C) INDIVIDUAL ISOLATE: tTAg transactivator

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..894

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..894

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 1..207

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 208..297

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..891

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG TCT AGA TTA GAT AAA AGT AAA GTG ATT AAC AGC GCA TTA GAG CTG 48
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

CTT AAT GAG GTC GGA ATC GAA GGT TTA ACA ACC CGT AAA CTC GCC CAG 96
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

AAG CTA GGT GTA GAG CAG CCT ACA TTG TAT TGG CAT GTA AAA AAT AAG 144
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

CGG GCT TTG CTC GAC GCC TTA GCC ATT GAG ATG TTA GAT AGG CAC CAT 192
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

ACT CAC TTT TGC CCT TTA GAA GGG GAA AGC TGG CAA GAT TTT TTA CGT 240
Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

AAT AAC GCT AAA AGT TTT AGA TGT GCT TTA CTA AGT CAT CGC GAT GGA 288
Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

GCA AAA GTA CAT TTA GGT ACA CGG CCT ACA GAA AAA CAG TAT GAA ACT 336
Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

CTC GAA AAT CAA TTA GCC TTT TTA TGC CAA CAA GGT TTT TCA CTA GAG 384
Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

AAT GCA TTA TAT GCA CTC AGC GCT GTG GGG CAT TTT ACT TTA GGT TGC 432
Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

GTA TTG GAA GAT CAA GAG CAT CAA GTC GCT AAA GAA GAA AGG GAA ACA 480
Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

CCT ACT ACT GAT AGT ATG CCG CCA TTA TTA CGA CAA GCT ATC GAA TTA 528
Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

TTT GAT CAC CAA GGT GCA GAG CCA GCC TTC TTA TTC GGC CTT GAA TTG 576
Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

ATC ATA TGC GGA TTA GAA AAA CAA CTT AAA TGT GAA AGT GGG TCT GAT 624
Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Asp
195 200 205

CCA TCG ATA CAC ACG CGC AGA CTG TCG ACG GCC CCC CCG ACC GAT GTC 672
Pro Ser Ile His Thr Arg Arg Leu Ser Thr Ala Pro Pro Thr Asp Val
210 215 220

AGC CTG GGG GAC GAG CTC CAC TTA GAC GGC GAG GAC GTG GCG ATG GCG 720
Ser Leu Gly Asp Glu Leu His Leu Asp Gly Glu Asp Val Ala Met Ala
225 230 235 240

CAT GCC GAC GCG CTA GAC GAT TTC GAT CTG GAC ATG TTG GGG GAC GGG 768
His Ala Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Asp Gly
245 250 255

GAT TCC CCG GGT CCG GGA TTT ACC CCC CAC GAC TCC GCC CCC TAC GGC 816
Asp Ser Pro Gly Pro Gly Phe Thr Pro His Asp Ser Ala Pro Tyr Gly
260 265 270

GCT CTG GAT ATG GCC GAC TTC GAG TTT GAG CAG ATG TTT ACC GAT GCC 864
Ala Leu Asp Met Ala Asp Phe Glu Phe Glu Gln Met Phe Thr Asp Ala
275 280 285

CTT GGA ATT GAC GAG TAC GGT GGG TTC TAG 894
Leu Gly Ile Asp Glu Tyr Gly Gly Phe
290 295



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 297 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Asp
195 200 205

Pro Ser Ile His Thr Arg Arg Leu Ser Thr Ala Pro Pro Thr Asp Val
210 215 220

Ser Leu Gly Asp Glu Leu His Leu Asp Gly Glu Asp Val Ala Met Ala
225 230 235 240

His Ala Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Asp Gly
245 250 255

Asp Ser Pro Gly Pro Gly Phe Thr Pro His Asp Ser Ala Pro Tyr Gly
260 265 270

Ala Leu Asp Met Ala Asp Phe Glu Phe Glu Gln Met Phe Thr Asp Ala
275 280 285

Leu Gly Ile Asp Glu Tyr Gly Gly Phe
290 295



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 450 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(B) STRAIN: K12, Towne

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 382..450

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GAATTCCTCG AGTTTACCAC TCCCTATCAG TGATAGAGAA AAGTGAAAGT CGAGTTTACC 60
ACTCCCTATC AGTGATAGAG AAAAGTGAAA GTCGAGTTTA CCACTCCCTA TCAGTGATAG 120
AGAAAAGTGA AAGTCGAGTT TACCACTCCC TATCAGTGAT AGAGAAAAGT GAAAGTCGAG 180
TTTACCACTC CCTATCAGTG ATAGAGAAAA GTGAAAGTCG AGTTTACCAC TCCCTATCAG 240
TGATAGAGAA AAGTGAAAGT CGAGTTTACC ACTCCCTATC AGTGATAGAG AAAAGTGAAA 300
GTCGAGCTCG GTACCCGGGT CGAGTAGGCG TGTACGGTGG GAGGCCTATA TAAGCAGAGC 360
TCGTTTAGTG AACCGTCAGA TCGCCTGGAG ACGCCATCCA CGCTGTTTTG ACCTCCATAG 420
AAGACACCGG GACCGATCCA GCCTCCGCGG 450

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 450 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(B) STRAIN: Towne

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 382..450

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAATTCCTCG ACCCGGGTAC CGAGCTCGAC TTTCACTTTT CTCTATCACT GATAGGGAGT 60
GGTAAACTCG ACTTTCACTT TTCTCTATCA CTGATAGGGA GTGGTAAACT CGACTTTCAC 120
TTTTCTCTAT CACTGATAGG GAGTGGTAAA CTCGACTTTC ACTTTTCTCT ATCACTGATA 180
GGGAGTGGTA AACTCGACTT TCACTTTTCT CTATCACTGA TAGGGAGTGG TAAACTCGAC 240
TTTCACTTTT CTCTATCACT GATAGGGAGT GGTAAACTCG ACTTTCACTT TTCTCTATCA 300
CTGATAGGGA GTGGTAAACT CGAGTAGGCG TGTACGGTGG GAGGCCTATA TAAGCAGAGC 360
TCGTTTAGTG AACCGTCAGA TCGCCTGGAG ACGCCATCCA CGCTGTTTTG ACCTCCATAG 420
AAGACACCGG GACCGATCCA GCCTCCGCGG 450

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 398 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Herpes Simplex Virus

(B) STRAIN: KOS

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GAGCTCGACT TTCACTTTTC TCTATCACTG ATAGGGAGTG GTAAACTCGA CTTTCACTTT 60
TCTCTATCAC TGATAGGGAG TGGTAAACTC GACTTTCACT TTTCTCTATC ACTGATAGGG 120
AGTGGTAAAC TCGACTTTCA CTTTTCTCTA TCACTGATAG GGAGTGGTAA ACTCGACTTT 180
CACTTTTCTC TATCACTGAT AGGGAGTGGT AAACTCGACT TTCACTTTTC TCTATCACTG 240
ATAGGGAGTG GTAAACTCGA CTTTCACTTT TCTCTATCAC TGATAGGGAG TGGTAAACTC 300
GAGATCCGGC GAATTCGAAC ACGCAGATGC AGTCGGGGCG GCGCGGTCCG AGGTCCACTT 360
CGCATATTAA GGTGACGCGT GTGGCCTCGA ACACCGAG 398

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6244 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(B) STRAIN: Towne (hCMV)

(vii) IMMEDIATE SOURCE:

(B) CLONE: pUHD BGR3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



CTCGAGTTTA CCACTCCCTA TCAGTGATAG AGAAAAGTGA AAGTCGAGTT TACCACTCCC 60
TATCAGTGAT AGAGAAAAGT GAAAGTCGAG TTTACCACTC CCTATCAGTG ATAGAGAAAA 120
GTGAAAGTCG AGTTTACCAC TCCCTATCAG TGATAGAGAA AAGTGAAAGT CGAGTTTACC 180
ACTCCCTATC AGTGATAGAG AAAAGTGAAA GTCGAGTTTA CCACTCCCTA TCAGTGATAG 240
AGAAAAGTGA AAGTCGAGTT TACCACTCCC TATCAGTGAT AGAGAAAAGT GAAAGTCGAG 300
CTCGGTACCC GGGTCGAGTA GGCGTGTACG GTGGGAGGCC TATATAAGCA GAGCTCGTTT 360
AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC ATAGAAGACA 420
CCGGGACCGA TCCAGCCTCC GCGGCCCCGA ATTCGAGCTC GGTACCGGGC CCCCCCTCGA 480
GGTCGACGGT ATCGATAAGC TTGATATCGA ATTCCAGGAG GTGGAGATCC GCGGGTCCAG 540
CCAAACCCCA CACCCATTTT CTCCTCCCTC TGCCCCTATA TCCCGGCACC CCCTCCTCCT 600
AGCCCTTTCC CTCCTCCCGA GAGACGGGGG AGGAGAAAAG GGGAGTTCAG GTCGACATGA 660
CTGAGCTGAA GGCAAAGGAA CCTCGGGCTC CCCACGTGGC GGGCGGCGCG CCCTCCCCCA 720
CCGAGGTCGG ATCCCAGCTC CTGGGTCGCC CGGACCCTGG CCCCTTCCAG GGGAGCCAGA 780
CCTCAGAGGC CTCGTCTGTA GTCTCCGCCA TCCCCATCTC CCTGGACGGG TTGCTCTTCC 840



CCCGGCCCTG TCAGGGGCAG AACCCCCCAG ACGGGAAGAC GCAGGACCCA CCGTCGTTGT 900
CAGACGTGGA GGGCGCATTT CCTGGAGTCG AAGCCCCGGA GGGGGCAGGA GACAGCAGCT 960
CGAGACCTCC AGAAAAGGAC AGCGGCCTGC TGGACAGTGT CCTCGACACG CTCCTGGCGC 1020
CCTCGGGTCC CGGGCAGAGC CACGCCAGCC CTGCCACCTG CGAGGCCATC AGCCCGTGGT 1080
GCCTGTTTGG CCCCGACCTT CCCGAAGACC CCCGGGCTGC CCCCGCTACC AAAGGGGTGT 1140
TGGCCCCGCT CATGAGCCGA CCCGAGGACA AGGCAGGCGA CAGCTCTGGG ACGGCAGCGG 1200
CCCACAAGGT GCTGCCCAGG GGACTGTCAC CATCCAGGCA GCTGCTGCTC CCCTCCTCTG 1260
GGAGCCCTCA CTGGCCGGCA GTGAAGCCAT CCCCGCAGCC CGCTGCGGTG CAGGTAGACG 1320
AGGAGGACAG CTCCGAATCC GAGGGCACCG TGGGCCCGCT CCTGAAGGGC CAACCTCGGG 1380
CACTGGGAGG CACGGCGGCC GGAGGAGGAG CTGCCCCCGT CGCGTCTGGA GCGGCCGCAG 1440
GAGGCGTCGC CCTTGTCCCC AAGGAAGATT CTCGCTTCTC GGCGCCCAGG GTCTCCTTGG 1500
CGGAGCAGGA CGCGCCGGTG GCGCCTGGGC GCTCCCCGCT GGCCACCTCG GTGGTGGATT 1560
TCATCCACGT GCCCATCCTG CCTCTCAACC ACGCTTTCCT GGCCACCCGC ACCAGGCAGC 1620
TGCTGGAGGG GGAGAGCTAC GACGGCGGGG CCGCGGCCGC CAGCCCCTTC GTCCCGCAGC 1680
GGGGCTCCCC CTCTGCCTCG TCCACCCCTG TGGCGGGCGG CGACTTCCCC GACTGCACCT 1740
ACCCGCCCGA CGCCGAGCCC AAAGATGACG CGTTCCCCCT CTACGGCGAC TTCCAGCCGC 1800
CCGCCCTCAA GATAAAGGAG GAGGAAGAAG CCGCCGAGGC CGCGGCGCGC TCCCCGCGTA 1860
CGTACCTGGT GGCTGGTGCA AACCCCGCCG CCTTCCCGGA CTTCCAGCTG GCAGCGCCGC 1920
CGCCACCCTC GCTGCCGCCT CGAGTGCCCT CGTCCAGACC CGGGGAAGCG GCGGTGGCGG 1980
CCTCCCCAGG CAGTGCCTCC GTCTCCTCCT CGTCCTCGTC GGGGTCGACC CTGGAGTGCA 2040
TCCTGTACAA GGCAGAAGGC GCGCCGCCCC AGCAGGGCCC CTTCGCGCCG CTGCCCTGCA 2100
AGCCTCCGGG CGCCGGCGCC TGCCTGCTCC CGCGGGACGG CCTGCCCTCC ACCTCCGCCT 2160
CGGGCGCAGC CGCCGGGGCC GCCCCTGCGC TCTACCCGAC GCTCGGCCTC AACGGACTCC 2220
CGCAACTCGG CTACCAGGCC GCCGTGCTCA AGGAGGGCCT GCCGCAGGTC TACACGCCCT 2280
ATCTCAACTA CCTGAGGCCG GATTCAGAAG CCAGTCAGAG CCCACAGTAC AGCTTCGAGT 2340
CACTACCTCA GAAGATTTGT TTGATCTGTG GGGATGAAGC ATCAGGCTGT CATTATGGTG 2400
TCCTCACCTG TGGGAGCTGT AAGGTCTTCT TTAAAAGGGC AATGGAAGGG CAGCATAACT 2460
ATTTATGTGC TGGAAGAAAT GACTGCATTG TTGATAAAAT CCGCAGGAAA AACTGCCCGG 2520
CGTGTCGCCT TAGAAAGTGC TGTCAAGCTG GCATGGTCCT TGGAGGGCGA AAGTTTAAAA 2580
AGTTCAATAA AGTCAGAGTC ATGAGAGCAC TCGATGCTGT TGCTCTCCCA CAGCCAGTGG 2640
GCATTCCAAA TGAAAGCCAA CGAATCACTT TTTCTCCAAG TCAAGAGATA CAGTTAATTC 2700
CCCCTCTAAT CAACCTGTTA ATGAGCATTG AACCAGATGT GATCTATGCA GGACATGACA 2760
ACACAAAGCC TGATACCTCC AGTTCTTTGC TGACGAGTCT TAATCAACTA GGCGAGCGGC 2820
AACTTCTTTC AGTGGTAAAA TGGTCCAAAT CTCTTCCAGG TTTTCGAAAC TTACATATTG 2880
ATGACCAGAT AACTCTCATC CAGTATTCTT GGATGAGTTT AATGGTATTT GGACTAGGAT 2940
GGAGATCCTA CAAACATGTC AGTGGGCAGA TGCTGTATTT TGCACCTGAT CTAATATTAA 3000
ATGAACAGCG GATGAAAGAA TCATCATTCT ATTCACTATG CCTTACCATG TGGCAGATAC 3060
CGCAGGAGTT TGTCAAGCTT CAAGTTAGCC AAGAAGAGTT CCTCTGCATG AAAGTATTAC 3120
TACTTCTTAA TACAATTCCT TTGGAAGGAC TAAGAAGTCA AAGCCAGTTT GAAGAGATGA 3180
GATCAAGCTA CATTAGAGAG CTCATCAAGG CAATTGGTTT GAGGCAAAAA GGAGTTGTTT 3240
CCAGCTCACA GCGTTTCTAT CAGCTCACAA AACTTCTTGA TAACTTGCAT GATCTTGTCA 3300
AACAACTTCA CCTGTACTGC CTGAATACAT TTATCCAGTC CCGGGCGCTG AGTGTTGAAT 3360
TTCCAGAAAT GATGTCTGAA GTTATTGCTG CACAGTTACC CAAGATATTG GCAGGGATGG 3420
TGAAACCACT TCTCTTTCAT AAAAAGTGAA TGTCAATTAT TTTTCAAAGA ATTAAGTGTT 3480
GTGGTATGTC TTTCGTTTTG GTCAGGATTA TGACGTCTCG AGTTTTTATA ATATTCTGAA 3540
AGGGAATTCC TGCAGCCCGG GGGATCCACT AGTTCTAGAG GATCCAGACA TGATAAGATA 3600
CATTGATGAG TTTGGACAAA CCACAACTAG AATGCAGTGA AAAAAATGCT TTATTTGTGA 3660
AATTTGTGAT GCTATTGCTT TATTTGTAAC CATTATAAGC TGCAATAAAC AAGTTAACAA 3720
CAACAATTGC ATTCATTTTA TGTTTCAGGT TCAGGGGGAG GTGTGGGAGG TTTTTTAAAG 3780
CAAGTAAAAC CTCTACAAAT GTGGTATGGC TGATTATGAT CCTGCAAGCC TCGTCGTCTG 3840
GCCGGACCAC GCTATCTGTG CAAGGTCCCC GGACGCGCGC TCCATGAGCA GAGCGCCCGC 3900
CGCCGAGGCA AGACTCGGGC GGCGCCCTGC CCGTCCCACC AGGTCAACAG GCGGTAACCG 3960
GCCTCTTCAT CGGGAATGCG CGCGACCTTC AGCATCGCCG GCATGTCCCC TGGCGGACGG 4020
GAAGTATCAG CTCGACCAAG CTTGGCGAGA TTTTCAGGAG CTAAGGAAGC TAAAATGGAG 4080
AAAAAAATCA CTGGATATAC CACCGTTGAT ATATCCCAAT GGCATCGTAA AGAACATTTT 4140
GAGGCATTTC AGTCAGTTGC TCAATGTACC TATAACCAGA CCGTTCAGCT GCATTAATGA 4200
ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC 4260
ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG 4320
GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC 4380
CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC 4440







CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA 4500
CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC 4560
CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC CCTTTCTCAA 4620
TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG 4680
CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC 4740
AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA 4800
GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT 4860
AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT 4920
GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG 4980
CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG 5040
TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA 5100
AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA 5160
TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG 5220
ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG TCGTGTAGAT AACTACGATA 5280
CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG 5340
GCTCCAGATT TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT 5400
GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT 5460
TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT GGTGTCACGC 5520
TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG AGTTACATGA 5580
TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT 5640
AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC 5700
ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA 5760
TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA 5820
CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA 5880
AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT 5940
TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC 6000
GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA 6060
TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT 6120
TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC 6180
TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT 6240
CGTC 6244
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4963 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(vii) IMMEDIATE SOURCE:

(B) CLONE: pUHD BGR4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTCGAGTTTA CCACTCCCTA TCAGTGATAG AGAAAAGTGA AAGTCGAGTT TACCACTCCC 60
TATCAGTGAT AGAGAAAAGT GAAAGTCGAG TTTACCACTC CCTATCAGTG ATAGAGAAAA 120
GTGAAAGTCG AGTTTACCAC TCCCTATCAG TGATAGAGAA AAGTGAAAGT CGAGTTTACC 180
ACTCCCTATC AGTGATAGAG AAAAGTGAAA GTCGAGTTTA CCACTCCCTA TCAGTGATAG 240
AGAAAAGTGA AAGTCGAGTT TACCACTCCC TATCAGTGAT AGAGAAAAGT GAAAGTCGAG 300
CTCGGTACCC GGGTCGAGTA GGCGTGTACG GTGGGAGGCC TATATAAGCA GAGCTCGTTT 360
AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC ATAGAAGACA 420
CCGGGACCGA TCCAGCCTCC GCGGCCCCGA ATTCCGGCCA CGACCATGAC CATGACCCTC 480
CACACCAAAG CATCTGGGAT GGCCCTACTG CATCAGATCC AAGGGAACGA GCTGGAGCCC 540
CTGAACCGTC CGCAGCTCAA GATCCCCCTG GAGCGGCCCC TGGGCGAGGT GTACCTGGAC 600
AGCAGCAAGC CCGCCGTGTA CAACTACCCC GAGGGCGCCG CCTACGAGTT CAACGCCGCG 660
GCCGCCGCCA ACGCGCAGGT CTACGGTCAG ACCGGCCTCC CCTACGGCCC CGGGTCTGAG 720
GCTGCGGCGT TCGGCTCCAA CGGCCTGGGG GGTTTCCCCC CACTCAACAG CGTGTCTCCG 780
AGCCCGCTGA TGCTACTGCA CCCGCCGCCG CAGCTGTCGC CTTTCCTGCA GCCCCACGGC 840
CAGCAGGTGC CCTACTACCT GGAGAACGAG CCCAGCGGCT ACACGGTGCG CGAGGCCGGC 900
CCGCCGGCAT TCTACAGGCC AAATTCAGAT AATCGACGCC AGGGTGGCAG AGAAAGATTG 960
GCCAGTACCA ATGACAAGGG AAGTATGGCT ATGGAATCTG CCAAGGAGAC TCGCTACTGT 1020
GCAGTGTGCA ATGACTATGC TTCAGGCTAC CATTATGGAG TCTGGTCCTG TGAGGGCTGC 1080
AAGGCCTTCT TCAAGAGAAG TATTCAAGGA CATAACGACT ATATGTGTCC AGCCACCAAC 1140
CAGTGCACCA TTGATAAAAA CAGGAGGAAG AGCTGCCAGG CCTGCCGGCT CCGCAAATGC 1200



TACGAAGTGG GAATGATGAA AGGTGGGATA CGAAAAGACC GAAGAGGAGG GAGAATGTTG 1260
AAACACAAGC GCCAGAGAGA TGATGGGGAG GGCAGGGGTG AAGTGGGGTC TGCTGGAGAC 1320
ATGAGAGCTG CCAACCTTTG GCCAAGCCCG CTCATGATCA AACGCTCTAA GAAGAACAGC 1380
CTGGCCTTGT CCCTGACGGC CGACCAGATG GTCATGGCCT TGTTGGATGC TGAGCCCCCC 1440
ATACTCTATT CCGAGTATGA TCCTACCAGA CCCTTCAGTG AAGCTTCGAT GATGGGCTTA 1500
CTGACCAACC TGGCAGACAG GGAGCTGGTT CACATGATCA ACTGGGCGAA GAGGGTGCCA 1560
GGCTTTGTGG ATTTGACCCT CCATGATCAG GTCCACCTTC TAGAATGTGC CTGGCTAGAG 1620
ATCCTGATGA TTGGTCTCGT CTGGCGCTCC ATGGAGCACC CAGTGAAGCT ACTGTTTGCT 1680
CCTAACTTGC TCTTGGACAG GAACCAGGGA AAATGTGTAG AGGGCATGGT GGAGATCTTC 1740
GACATGCTGC TGGCTACATC ATCTCGGTTC CGCATGATGA ATCTGCAGGG AGAGGAGTTT 1800
GTGTGCCTCA AATCTATTAT TTTGCTTAAT TCTGGAGTGT ACACATTTCT GTCCAGCACC 1860
CTGAAGTCTC TGGAAGAGAA GGACCATATC CACCGAGTCC TGGACAAGAT CACAGACACT 1920
TTGATCCACC TGATGGCCAA GGCAGGCCTG ACCCTGCAGC AGCAGCACCA GCGGCTGGCC 1980
CAGCTCCTCC TCATCCTCTC CCACATCAGG CACATGAGTA ACAAAGGCAT GGAGCATCTG 2040
TACAGCATGA AGTGCAAGAA CGTGGTGCCC CTCTATGACC TGCTGCTGGA GATGCTGGAC 2100
GCCCACCGCC TACATGCGCC CACTAGCCGT GGAGGGGCAT CCGTGGAGGA GACGGACCAA 2160
AGCCACTTGG CCACTGCGGG CTCTACTTCA TCGCATTCCT TGCAAAAGTA TTACATCACG 2220
GGGGAGGCAG AGGGTTTCCC TGCCACAGTC TGAGAGCTCC CTGGCGGAAT TCGAGCTCGG 2280
TACCCGGGGA TCCTCTAGAG GATCCAGACA TGATAAGATA CATTGATGAG TTTGGACAAA 2340
CCACAACTAG AATGCAGTGA AAAAAATGCT TTATTTGTGA AATTTGTGAT GCTATTGCTT 2400
TATTTGTAAC CATTATAAGC TGCAATAAAC AAGTTAACAA CAACAATTGC ATTCATTTTA 2460
TGTTTCAGGT TCAGGGGGAG GTGTGGGAGG TTTTTTAAAG CAAGTAAAAC CTCTACAAAT 2520
GTGGTATGGC TGATTATGAT CCTGCAAGCC TCGTCGTCTG GCCGGACCAC GCTATCTGTG 2580
CAAGGTCCCC GGACGCGCGC TCCATGAGCA GAGCGCCCGC CGCCGAGGCA AGACTCGGGC 2640
GGCGCCCTGC CCGTCCCACC AGGTCAACAG GCGGTAACCG GCCTCTTCAT CGGGAATGCG 2700
CGCGACCTTC AGCATCGCCG GCATGTCCCC TGGCGGACGG GAAGTATCAG CTCGACCAAG 2760
CTTGGCGAGA TTTTCAGGAG CTAAGGAAGC TAAAATGGAG AAAAAAATCA CTGGATATAC 2820
CACCGTTGAT ATATCCCAAT GGCATCGTAA AGAACATTTT GAGGCATTTC AGTCAGTTGC 2880
TCAATGTACC TATAACCAGA CCGTTCAGCT GCATTAATGA ATCGGCCAAC GCGCGGGGAG 2940
AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC TGCGCTCGGT 3000
CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA 3060
ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG 3120
TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA 3180
AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT 3240
TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT 3300
GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT 3360
CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC 3420
CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT 3480
ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC 3540
TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT 3600
CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA 3660
ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA 3720
AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA 3780
AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT 3840
TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA 3900
CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC 3960
CATAGTTGCC TGATCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC 4020
CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGCAATA 4080
AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC 4140
CAGTCTATTA ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC 4200
AACGTTGTTG CCATTGCTAC AGGCATCGTG GTGTCACGCT CGTCGTTTGG TATGGCTTCA 4260
TTCAGCTCCG GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT GTGCAAAAAA 4320
GCGGTTAGCT CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA 4380
CTCATGGTTA TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT 4440
TCTGTGACTG GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG GCGACCGAGT 4500
TGCTCTTGCC CGGCGTCAAT ACGGGATAAT ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG 4560
CTCATCATTG GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA 4620
TCCAGTTCGA TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT TACTTTCACC 4680
AGCGTTTCTG GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG 4740
ACACGGAAAT GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG CATTTATCAG 4800
GGTTATTGTC TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA ACAAATAGGG 4860
GTTCCGCGCA CATTTCCCCG AAAAGTGCCA CCTGACGTCT AAGAAACCAT TATTATCATG 4920
ACATTAACCT ATAAAAATAG GCGTATCACG AGGCCCTTTC GTC 4963



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TCGAGTTTAC CACTCCCTAT CAGTGATAGA GAAAAGTGAA AG 42



